{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Selecting Subsets of Data in Pandas\n",
    "\n",
    "This notebook is also available as a blog post on Medium.\n",
    "\n",
    "## Part 4: How NOT to select subsets of data\n",
    "This is part 4 of the series \"How to Select Subsets of Data\" from a pandas DataFrame or Series. Pandas offers a wide variety of options for subset selection, which necessitates multiple articles. This series is broken down into the following topics.\n",
    "\n",
    "1. Selection with `[]`, `.loc` and `.iloc`\n",
    "1. Boolean indexing\n",
    "1. Assigning subsets of data\n",
    "1. How NOT to select subsets of data\n",
    "1. Selection with a MultiIndex\n",
    "1. Selecting subsets of data with methods\n",
    "1. Selections with other Index types\n",
    "1. Internals, Miscellaneous, and Conclusion\n",
    "\n",
    "## Learning what not to do\n",
    "In all programming languages, and especially pandas, the number of incorrect or inefficient ways to complete a task heavily outnumbers the efficient or **idiomatic** ones. The term idiomatic refers to code that is efficient, easy to understand, and a common way (among experts) to accomplish a task in a particular library/language. \n",
    "\n",
    "The first three parts of this series showed the idiomatic way of making selections. In this section, we will cover the most common ways users incorrectly make subset selections. Some of these bad habits might be perfectly acceptable in other Python libraries, but will be unacceptable with pandas.\n",
    "\n",
    "## Getting the right answer with the wrong code\n",
    "One of the issues that prevents users from learning idiomatic pandas, is that it is still possible to get the correct final result while using highly inefficient and non-idiomatic code. Completing a task isn't necessarily indicative that your code is correct.\n",
    "\n",
    "Here are a few reasons why a solution that gives the correct result might not be good:\n",
    "\n",
    "* A slow solution might not scale to larger data\n",
    "* A solution might work in this particular instance but fail with slightly different data\n",
    "* A solution might be very fast, but hard to interpret by others\n",
    "\n",
    "## Chained indexing with lists\n",
    "**Chained indexing** is the first and most important subset selection problem we will discuss. Chained indexing occurs whenever two subset selections immediately following each other. \n",
    "\n",
    "To help simplify the idea, we will look at chained indexing with Python lists. Let's first create a list of integers:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[1, 5, 10, 3, 99, 5, 8, 20, 40]"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a = [1, 5, 10, 3, 99, 5, 8, 20, 40]\n",
    "a"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's make a normal subset selection by slicing from integer location 2 to 6."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[10, 3, 99, 5]"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a[2:6]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Chained indexing occurs whenever we make another subset selection immediately following this one. Let's select the first element from this new list in a single line of code:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "10"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a[2:6][0]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This is an example of chained indexing.\n",
    "\n",
    "## Assigning a new value to a list with chained indexing\n",
    "Let's say, we wanted to change this value that was selected from above from 10 to 50. Let's attempt to do this assignment with chained indexing:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[1, 5, 10, 3, 99, 5, 8, 20, 40]"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a[2:6][0] = 50\n",
    "a"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Nothing happened???\n",
    "As you can see, the list **`a`** was not modified at all. The reason for this, is that Python created a temporary intermediate list directly after the first subset selection. It might be easier to write out the execution of the above into two separate steps:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[50, 3, 99, 5]"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a_temp = a[2:6]\n",
    "a_temp[0] = 50\n",
    "a_temp"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[1, 5, 10, 3, 99, 5, 8, 20, 40]"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## The temporary object was the only one modified\n",
    "As you can see, the intermediate object, **`a_temp`**, was the only object modified. The original was left untouched.\n",
    "\n",
    "## But doesn't Python modify objects that are the 'same'?\n",
    "Let's take a look at a closely related example where Python will modify two variables at the 'same' time. Let's create a new list, **`a1`** and set it equal to **`b1`** and then modify the first element of it:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[99, 1, 2, 3]"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a1 = [0, 1, 2, 3]\n",
    "b1 = a1\n",
    "b1[0] = 99\n",
    "b1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[99, 1, 2, 3]"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Both have changed!?!\n",
    "In the above example, we made a single call to change the first element of **`b1`**. This modified both **`b1`** and **`a1`**. This happened because **`a1`** and **`b1`** are referring to the exact same object in memory. **`a1`** and **`b1`** are simply **names** that are used to refer to the underlying objects, which in this case, are the same.\n",
    "\n",
    "## Proof they are the same with the `id` function\n",
    "We can prove that **`a1`** and **`b1`** are referring to the same object with the built-in **`id`** function, which returns the memory address of the object."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "4501625608"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "id(a1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "4501625608"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "id(b1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "id(a1) == id(b1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## So, why did our assignment with chained indexing fail?\n",
    "Whenever you take a slice of a list, Python creates a brand new **copy** (a shallow-copy to be exact) of the data. A copy of an object is completely unrelated to the original and has it's own place in memory. \n",
    "\n",
    "Whenever we write **`a[2:6]`**, the result of this is a brand new list object in memory unrelated to the list **`a`**. The statement **`a[2:6][0] = 50`** does actually make an assignment to that temporary list copy, but it is not saved to a variable, so there is no way to track it.\n",
    "\n",
    "To properly assign the 2nd position in the list **`a`** to 50, you would simply do **`a[2] = 50`** instead of chained indexing.\n",
    "\n",
    "## Shallow vs Deep Copy (Advanced)\n",
    "You can safely skip this section as it won't be relevant to our subset selection. Python creates a **shallow copy** when performing a slice on a list. If you have mutable objects within your list, then these inner list objects won't be copied and will still be referring to the same object.\n",
    "\n",
    "Let's create a list with a another list inside of it."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[7, [1, 2], 5, 6, 10, 14, 19, 20]"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a = [7, [1, 2], 5, 6, 10, 14, 19, 20]\n",
    "a"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's take a slice of this list: "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[[1, 2], 5, 6]"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a_slice = a[1:4]\n",
    "a_slice"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's look at the **`id`** of the inner list from both **`a`** and **`a_slice`**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "id(a[1]) == id(a_slice[0])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "They are the exact same! Python has created a **shallow copy** here, meaning every mutable object inside of the slice is still the same as it was in the original.\n",
    "\n",
    "## Making an assignment to the inner list\n",
    "Let's change the first value in the inner list of **`a_slice`** and see if it changes the inner list of **`a`**:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[1, 2]"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "inner_list = a_slice[0]\n",
    "inner_list"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "inner_list[0] = 99"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[[99, 2], 5, 6]"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a_slice"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[7, [99, 2], 5, 6, 10, 14, 19, 20]"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The inner list for both variables had its first element changed. That inner list never copied when taking that first slice and therefore it exists in only one unique place in memory.\n",
    "\n",
    "## Chained indexing assignment in one step\n",
    "We can modify this inner list in a single chain of indexing in one line of code."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[7, [99, 2], 5, 6, 10, 14, 19, 20]"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# output our current list\n",
    "a"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[7, [1000, 2], 5, 6, 10, 14, 19, 20]"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a[1:5][0][0] = 1000\n",
    "a"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Using the copy module to create a deep copy\n",
    "The standard library comes with the [copy module](https://docs.python.org/3/library/copy.html) to make a **deep copy** of your object. A deep copy creates a copy of every single mutable object within your object.\n",
    "\n",
    "Let's re-run the code from a couple sections above where we take a slice of a list containing an inner list and this time make a deep copy before checking the **`id`** of each inner list."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import copy"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [],
   "source": [
    "a = [7, [1, 2], 5, 6, 10, 14, 19, 20]\n",
    "a_slice = copy.deepcopy(a[1:4])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "False"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "id(a[1]) == id(a_slice[0])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Chained Indexing in pandas\n",
    "Chained indexing happens analogously to pandas DataFrames/Series. Whenever you do two (or more) subset selections one after the other, you are doing chained indexing. Note, that this isn't 100% indicative that you are doing something wrong but for the vast majority of cases that I have seen, it is.\n",
    "\n",
    "Let's walk through several examples of chained indexing on a pandas DataFrame. To simplify matters, we will use some fake data on a small DataFrame."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>state</th>\n",
       "      <th>color</th>\n",
       "      <th>food</th>\n",
       "      <th>age</th>\n",
       "      <th>height</th>\n",
       "      <th>score</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Jane</th>\n",
       "      <td>NY</td>\n",
       "      <td>blue</td>\n",
       "      <td>Steak</td>\n",
       "      <td>30</td>\n",
       "      <td>165</td>\n",
       "      <td>4.6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Niko</th>\n",
       "      <td>TX</td>\n",
       "      <td>green</td>\n",
       "      <td>Lamb</td>\n",
       "      <td>2</td>\n",
       "      <td>70</td>\n",
       "      <td>8.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Aaron</th>\n",
       "      <td>FL</td>\n",
       "      <td>red</td>\n",
       "      <td>Mango</td>\n",
       "      <td>12</td>\n",
       "      <td>120</td>\n",
       "      <td>9.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Penelope</th>\n",
       "      <td>AL</td>\n",
       "      <td>white</td>\n",
       "      <td>Apple</td>\n",
       "      <td>4</td>\n",
       "      <td>80</td>\n",
       "      <td>3.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Dean</th>\n",
       "      <td>AK</td>\n",
       "      <td>gray</td>\n",
       "      <td>Cheese</td>\n",
       "      <td>32</td>\n",
       "      <td>180</td>\n",
       "      <td>1.8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Christina</th>\n",
       "      <td>TX</td>\n",
       "      <td>black</td>\n",
       "      <td>Melon</td>\n",
       "      <td>33</td>\n",
       "      <td>172</td>\n",
       "      <td>9.5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Cornelia</th>\n",
       "      <td>TX</td>\n",
       "      <td>red</td>\n",
       "      <td>Beans</td>\n",
       "      <td>69</td>\n",
       "      <td>150</td>\n",
       "      <td>2.2</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          state  color    food  age  height  score\n",
       "Jane         NY   blue   Steak   30     165    4.6\n",
       "Niko         TX  green    Lamb    2      70    8.3\n",
       "Aaron        FL    red   Mango   12     120    9.0\n",
       "Penelope     AL  white   Apple    4      80    3.3\n",
       "Dean         AK   gray  Cheese   32     180    1.8\n",
       "Christina    TX  black   Melon   33     172    9.5\n",
       "Cornelia     TX    red   Beans   69     150    2.2"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import pandas as pd\n",
    "df = pd.read_csv('../../data/sample_data.csv', index_col=0)\n",
    "df"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Chained Indexing Example 1\n",
    "Let's select the columns **`food`**, **`age`**, and **`color`** and then immediately select just **`age`** using chained indexing:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Jane         30\n",
       "Niko          2\n",
       "Aaron        12\n",
       "Penelope      4\n",
       "Dean         32\n",
       "Christina    33\n",
       "Cornelia     69\n",
       "Name: age, dtype: int64"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df[['food', 'age', 'color']]['age']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "It might be easier to store each selection to a variable first:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Jane         30\n",
       "Niko          2\n",
       "Aaron        12\n",
       "Penelope      4\n",
       "Dean         32\n",
       "Christina    33\n",
       "Cornelia     69\n",
       "Name: age, dtype: int64"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a = ['food', 'age', 'color']\n",
    "b = 'age'\n",
    "df[a][b]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Chained Indexing Example 2\n",
    "\n",
    "Let's use **`.loc`** to select **`Niko`** and **`Dean`** along with **`state`**, **`height`**, and **`color`**. Then, let's chain *just the indexing operator* to select **`height`** and **`color`**."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>height</th>\n",
       "      <th>color</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Niko</th>\n",
       "      <td>70</td>\n",
       "      <td>green</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Dean</th>\n",
       "      <td>180</td>\n",
       "      <td>gray</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "      height  color\n",
       "Niko      70  green\n",
       "Dean     180   gray"
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.loc[['Niko', 'Dean'], ['state', 'height', 'color']][['height', 'color']]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "That's a lot of brackets in the above expression. Let's separate each selection into their own variables. Below, the variable **`a`** is technically a two-item tuple of lists."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>height</th>\n",
       "      <th>color</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Niko</th>\n",
       "      <td>70</td>\n",
       "      <td>green</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Dean</th>\n",
       "      <td>180</td>\n",
       "      <td>gray</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "      height  color\n",
       "Niko      70  green\n",
       "Dean     180   gray"
      ]
     },
     "execution_count": 28,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a = ['Niko', 'Dean'], ['state', 'height', 'color']\n",
    "b = ['height', 'color']\n",
    "\n",
    "df.loc[a][b]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Chained Indexing Example 3\n",
    "\n",
    "Let's use **`.iloc`** first to select the rows 2 through 5 and then chain it again to select the last three columns."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>age</th>\n",
       "      <th>height</th>\n",
       "      <th>score</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Aaron</th>\n",
       "      <td>12</td>\n",
       "      <td>120</td>\n",
       "      <td>9.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Penelope</th>\n",
       "      <td>4</td>\n",
       "      <td>80</td>\n",
       "      <td>3.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Dean</th>\n",
       "      <td>32</td>\n",
       "      <td>180</td>\n",
       "      <td>1.8</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          age  height  score\n",
       "Aaron      12     120    9.0\n",
       "Penelope    4      80    3.3\n",
       "Dean       32     180    1.8"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.iloc[2:5].iloc[:, -3:]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Chained Indexing Example 4\n",
    "Let's select the rows **`Aaron`**, **`Dean`**, and **`Christina`** with **`.loc`** and then the columns **`age`** and **`food`** with just the indexing operator."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>age</th>\n",
       "      <th>food</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Aaron</th>\n",
       "      <td>12</td>\n",
       "      <td>Mango</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Dean</th>\n",
       "      <td>32</td>\n",
       "      <td>Cheese</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Christina</th>\n",
       "      <td>33</td>\n",
       "      <td>Melon</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "           age    food\n",
       "Aaron       12   Mango\n",
       "Dean        32  Cheese\n",
       "Christina   33   Melon"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.loc[['Aaron', 'Dean', 'Christina']][['age', 'food']]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Chained Indexing Example 5\n",
    "Select all the rows with age greater than 10 with *just the indexing operator* and then select the **`score`** column."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Jane         4.6\n",
       "Aaron        9.0\n",
       "Dean         1.8\n",
       "Christina    9.5\n",
       "Cornelia     2.2\n",
       "Name: score, dtype: float64"
      ]
     },
     "execution_count": 31,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df[df['age'] > 10]['score']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Identifying Chained Indexing\n",
    "First, all the examples from above are things you should strive to avoid. All the selections from above could have been reproduced in a much simpler and more direct manner.\n",
    "\n",
    "As mentioned previously, chained indexing occurs whenever you use the indexers **`[]`**, **`.loc`**, or **`.iloc`** twice in a row.\n",
    "\n",
    "If you are having trouble identifying chained indexing you can look for the following:\n",
    "* A closed bracket followed by an open bracket - Look for **`][`** as with **`df[a][b]`** \n",
    "* **`.loc`** or **`.iloc`** following a closed bracket like in example 3: **`df.iloc[2:5].iloc[:, -3:]`**\n",
    "\n",
    "Another way to determine if you have chained indexing is if you can break the operation up into two lines. For instance, **`df[a][b]`** can be broken up into:\n",
    "\n",
    "```Python\n",
    ">>> df1 = df[a]\n",
    ">>> df1[b]\n",
    "```\n",
    "\n",
    "Thanks to Tom Augspurger for the first bullet. See his [blog post on indexing](https://tomaugspurger.github.io/modern-1-intro.html) for more.\n",
    "\n",
    "## Making the examples idiomatic\n",
    "Let's re-write all of the above examples idiomatically."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Chained Indexing Example 1 - Idiomatic"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Jane         30\n",
       "Niko          2\n",
       "Aaron        12\n",
       "Penelope      4\n",
       "Dean         32\n",
       "Christina    33\n",
       "Cornelia     69\n",
       "Name: age, dtype: int64"
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# df[['food', 'age', 'color']]['age'] - bad\n",
    "df['age']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Chained Indexing Example 2 - Idiomatic"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>height</th>\n",
       "      <th>color</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Niko</th>\n",
       "      <td>70</td>\n",
       "      <td>green</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Dean</th>\n",
       "      <td>180</td>\n",
       "      <td>gray</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "      height  color\n",
       "Niko      70  green\n",
       "Dean     180   gray"
      ]
     },
     "execution_count": 33,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# df.loc[['Niko', 'Dean'], ['state', 'height', 'color']][['height', 'color']] - bad\n",
    "df.loc[['Niko', 'Dean'], ['height', 'color']]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Chained Indexing Example 3 - Idiomatic"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>age</th>\n",
       "      <th>height</th>\n",
       "      <th>score</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Aaron</th>\n",
       "      <td>12</td>\n",
       "      <td>120</td>\n",
       "      <td>9.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Penelope</th>\n",
       "      <td>4</td>\n",
       "      <td>80</td>\n",
       "      <td>3.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Dean</th>\n",
       "      <td>32</td>\n",
       "      <td>180</td>\n",
       "      <td>1.8</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          age  height  score\n",
       "Aaron      12     120    9.0\n",
       "Penelope    4      80    3.3\n",
       "Dean       32     180    1.8"
      ]
     },
     "execution_count": 34,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# df.iloc[2:5].iloc[:, -3:] - bad\n",
    "df.iloc[2:5, -3:]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Chained Indexing Example 4 - Idiomatic"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>age</th>\n",
       "      <th>food</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Aaron</th>\n",
       "      <td>12</td>\n",
       "      <td>Mango</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Dean</th>\n",
       "      <td>32</td>\n",
       "      <td>Cheese</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Christina</th>\n",
       "      <td>33</td>\n",
       "      <td>Melon</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "           age    food\n",
       "Aaron       12   Mango\n",
       "Dean        32  Cheese\n",
       "Christina   33   Melon"
      ]
     },
     "execution_count": 35,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# df.loc[['Aaron', 'Dean', 'Christina']][['age', 'food']] - bad\n",
    "df.loc[['Aaron', 'Dean', 'Christina'], ['age', 'food']]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Chained Indexing Example 5 - Idiomatic"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Jane         4.6\n",
       "Aaron        9.0\n",
       "Dean         1.8\n",
       "Christina    9.5\n",
       "Cornelia     2.2\n",
       "Name: score, dtype: float64"
      ]
     },
     "execution_count": 36,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# df[df['age'] > 10]['score'] - bad\n",
    "df.loc[df['age'] > 10, 'score']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Why is chained indexing bad?\n",
    "There are two primary reasons that chained indexing should be avoided if possible. \n",
    "\n",
    "#### Two separate operations\n",
    "The first, and less important reason, is that two separate pandas operations will be called instead of just one.\n",
    "\n",
    "Let's take example 4 from above:\n",
    "\n",
    "```Python\n",
    ">>> df.loc[['Aaron', 'Dean', 'Christina']][['age', 'food']]\n",
    "```\n",
    "When this code is executed, two independent operations are completed. The following is run first:\n",
    "\n",
    "```Python\n",
    ">>> df.loc[['Aaron', 'Dean', 'Christina']]\n",
    "```\n",
    "\n",
    "The result of this is a DataFrame, and on this temporary and intermediate DataFrame the second and final operation is run to select two columns: **`[['age', 'food']]`**.\n",
    "\n",
    "Let's see this operation written idimoatically:\n",
    "\n",
    "```Python\n",
    ">>> df.loc[['Aaron', 'Dean', 'Christina'], ['age', 'food']]\n",
    "```\n",
    "\n",
    "There is exactly one operation, a call to the **`.loc`** indexer that is passed both the row and column selections simultaneously.\n",
    "\n",
    "#### `SettingWithCopy` warning on assignment\n",
    "The major problem with chained indexing is when assigning new values to the subset, in which Pandas will usually emit the **`SettingWithCopy`** warning.\n",
    "\n",
    "Let's use example 5 from above with its chained indexing version to change all the scores of those older than 10 to 99."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/Users/Ted/anaconda3/lib/python3.6/site-packages/ipykernel_launcher.py:1: SettingWithCopyWarning: \n",
      "A value is trying to be set on a copy of a slice from a DataFrame.\n",
      "Try using .loc[row_indexer,col_indexer] = value instead\n",
      "\n",
      "See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy\n",
      "  \"\"\"Entry point for launching an IPython kernel.\n"
     ]
    }
   ],
   "source": [
    "df[df['age'] > 10]['score'] = 99"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The **`SettingWithCopy`** warning was triggered. Let's output the DataFrame to see if the assignment happened correctly."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>state</th>\n",
       "      <th>color</th>\n",
       "      <th>food</th>\n",
       "      <th>age</th>\n",
       "      <th>height</th>\n",
       "      <th>score</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Jane</th>\n",
       "      <td>NY</td>\n",
       "      <td>blue</td>\n",
       "      <td>Steak</td>\n",
       "      <td>30</td>\n",
       "      <td>165</td>\n",
       "      <td>4.6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Niko</th>\n",
       "      <td>TX</td>\n",
       "      <td>green</td>\n",
       "      <td>Lamb</td>\n",
       "      <td>2</td>\n",
       "      <td>70</td>\n",
       "      <td>8.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Aaron</th>\n",
       "      <td>FL</td>\n",
       "      <td>red</td>\n",
       "      <td>Mango</td>\n",
       "      <td>12</td>\n",
       "      <td>120</td>\n",
       "      <td>9.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Penelope</th>\n",
       "      <td>AL</td>\n",
       "      <td>white</td>\n",
       "      <td>Apple</td>\n",
       "      <td>4</td>\n",
       "      <td>80</td>\n",
       "      <td>3.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Dean</th>\n",
       "      <td>AK</td>\n",
       "      <td>gray</td>\n",
       "      <td>Cheese</td>\n",
       "      <td>32</td>\n",
       "      <td>180</td>\n",
       "      <td>1.8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Christina</th>\n",
       "      <td>TX</td>\n",
       "      <td>black</td>\n",
       "      <td>Melon</td>\n",
       "      <td>33</td>\n",
       "      <td>172</td>\n",
       "      <td>9.5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Cornelia</th>\n",
       "      <td>TX</td>\n",
       "      <td>red</td>\n",
       "      <td>Beans</td>\n",
       "      <td>69</td>\n",
       "      <td>150</td>\n",
       "      <td>2.2</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          state  color    food  age  height  score\n",
       "Jane         NY   blue   Steak   30     165    4.6\n",
       "Niko         TX  green    Lamb    2      70    8.3\n",
       "Aaron        FL    red   Mango   12     120    9.0\n",
       "Penelope     AL  white   Apple    4      80    3.3\n",
       "Dean         AK   gray  Cheese   32     180    1.8\n",
       "Christina    TX  black   Melon   33     172    9.5\n",
       "Cornelia     TX    red   Beans   69     150    2.2"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Failed Assignment!\n",
    "Our DataFrame failed to make the assignment. Let's break this operation up into two steps to give us more insight into what is happening."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/Users/Ted/anaconda3/lib/python3.6/site-packages/ipykernel_launcher.py:2: SettingWithCopyWarning: \n",
      "A value is trying to be set on a copy of a slice from a DataFrame.\n",
      "Try using .loc[row_indexer,col_indexer] = value instead\n",
      "\n",
      "See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy\n",
      "  \n"
     ]
    }
   ],
   "source": [
    "df_temp = df[df['age'] > 10]\n",
    "df_temp['score'] = 99"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Take a look at **`df_temp`**:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>state</th>\n",
       "      <th>color</th>\n",
       "      <th>food</th>\n",
       "      <th>age</th>\n",
       "      <th>height</th>\n",
       "      <th>score</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Jane</th>\n",
       "      <td>NY</td>\n",
       "      <td>blue</td>\n",
       "      <td>Steak</td>\n",
       "      <td>30</td>\n",
       "      <td>165</td>\n",
       "      <td>99</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Aaron</th>\n",
       "      <td>FL</td>\n",
       "      <td>red</td>\n",
       "      <td>Mango</td>\n",
       "      <td>12</td>\n",
       "      <td>120</td>\n",
       "      <td>99</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Dean</th>\n",
       "      <td>AK</td>\n",
       "      <td>gray</td>\n",
       "      <td>Cheese</td>\n",
       "      <td>32</td>\n",
       "      <td>180</td>\n",
       "      <td>99</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Christina</th>\n",
       "      <td>TX</td>\n",
       "      <td>black</td>\n",
       "      <td>Melon</td>\n",
       "      <td>33</td>\n",
       "      <td>172</td>\n",
       "      <td>99</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Cornelia</th>\n",
       "      <td>TX</td>\n",
       "      <td>red</td>\n",
       "      <td>Beans</td>\n",
       "      <td>69</td>\n",
       "      <td>150</td>\n",
       "      <td>99</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          state  color    food  age  height  score\n",
       "Jane         NY   blue   Steak   30     165     99\n",
       "Aaron        FL    red   Mango   12     120     99\n",
       "Dean         AK   gray  Cheese   32     180     99\n",
       "Christina    TX  black   Melon   33     172     99\n",
       "Cornelia     TX    red   Beans   69     150     99"
      ]
     },
     "execution_count": 40,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_temp"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The assignment completed correctly for the intermediate DataFrame but not for our original. The reason for this, is the same reason as to why the chained indexing did not work for the list at the top of this tutorial.\n",
    "\n",
    "When we run **`df[df['age'] > 10]`**, pandas creates an entire new **copy** of the data. So when we try and assign the **`score`** column, we are modifying this new copy and not the original. Thus, the name **`SettingWithCopy`** make sense: pandas is warning you that you are setting (making an assignment) on a copy of a DataFrame.\n",
    "\n",
    "## How to assign correctly?\n",
    "You should never use chained indexing to make an assignment. Instead, make exactly a single call to one of the indexers. In this case, we can use **`.loc`** to properly make the selection and assignment."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>state</th>\n",
       "      <th>color</th>\n",
       "      <th>food</th>\n",
       "      <th>age</th>\n",
       "      <th>height</th>\n",
       "      <th>score</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Jane</th>\n",
       "      <td>NY</td>\n",
       "      <td>blue</td>\n",
       "      <td>Steak</td>\n",
       "      <td>30</td>\n",
       "      <td>165</td>\n",
       "      <td>99.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Niko</th>\n",
       "      <td>TX</td>\n",
       "      <td>green</td>\n",
       "      <td>Lamb</td>\n",
       "      <td>2</td>\n",
       "      <td>70</td>\n",
       "      <td>8.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Aaron</th>\n",
       "      <td>FL</td>\n",
       "      <td>red</td>\n",
       "      <td>Mango</td>\n",
       "      <td>12</td>\n",
       "      <td>120</td>\n",
       "      <td>99.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Penelope</th>\n",
       "      <td>AL</td>\n",
       "      <td>white</td>\n",
       "      <td>Apple</td>\n",
       "      <td>4</td>\n",
       "      <td>80</td>\n",
       "      <td>3.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Dean</th>\n",
       "      <td>AK</td>\n",
       "      <td>gray</td>\n",
       "      <td>Cheese</td>\n",
       "      <td>32</td>\n",
       "      <td>180</td>\n",
       "      <td>99.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Christina</th>\n",
       "      <td>TX</td>\n",
       "      <td>black</td>\n",
       "      <td>Melon</td>\n",
       "      <td>33</td>\n",
       "      <td>172</td>\n",
       "      <td>99.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Cornelia</th>\n",
       "      <td>TX</td>\n",
       "      <td>red</td>\n",
       "      <td>Beans</td>\n",
       "      <td>69</td>\n",
       "      <td>150</td>\n",
       "      <td>99.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          state  color    food  age  height  score\n",
       "Jane         NY   blue   Steak   30     165   99.0\n",
       "Niko         TX  green    Lamb    2      70    8.3\n",
       "Aaron        FL    red   Mango   12     120   99.0\n",
       "Penelope     AL  white   Apple    4      80    3.3\n",
       "Dean         AK   gray  Cheese   32     180   99.0\n",
       "Christina    TX  black   Melon   33     172   99.0\n",
       "Cornelia     TX    red   Beans   69     150   99.0"
      ]
     },
     "execution_count": 41,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.loc[df['age'] > 10, 'score'] = 99\n",
    "df"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## SettingWithCopy example that does assignment\n",
    "Let's do another nearly identical chained indexing as the previous example, except we will reverse the order of the chain. We will first select the **`score`** column and use boolean indexing to choose the people older than 10 and assign them a score of 0.\n",
    "\n",
    "First we will, just make the selection (without assignment) so you can see what we are trying to assign."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Jane         99.0\n",
       "Aaron        99.0\n",
       "Dean         99.0\n",
       "Christina    99.0\n",
       "Cornelia     99.0\n",
       "Name: score, dtype: float64"
      ]
     },
     "execution_count": 42,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df['score'][df['age'] > 10]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, let's make the assignment:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/Users/Ted/anaconda3/lib/python3.6/site-packages/ipykernel_launcher.py:1: SettingWithCopyWarning: \n",
      "A value is trying to be set on a copy of a slice from a DataFrame\n",
      "\n",
      "See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy\n",
      "  \"\"\"Entry point for launching an IPython kernel.\n"
     ]
    }
   ],
   "source": [
    "df['score'][df['age'] > 10] = 0"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The warning is triggered again. Let's output the DataFrame:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>state</th>\n",
       "      <th>color</th>\n",
       "      <th>food</th>\n",
       "      <th>age</th>\n",
       "      <th>height</th>\n",
       "      <th>score</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Jane</th>\n",
       "      <td>NY</td>\n",
       "      <td>blue</td>\n",
       "      <td>Steak</td>\n",
       "      <td>30</td>\n",
       "      <td>165</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Niko</th>\n",
       "      <td>TX</td>\n",
       "      <td>green</td>\n",
       "      <td>Lamb</td>\n",
       "      <td>2</td>\n",
       "      <td>70</td>\n",
       "      <td>8.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Aaron</th>\n",
       "      <td>FL</td>\n",
       "      <td>red</td>\n",
       "      <td>Mango</td>\n",
       "      <td>12</td>\n",
       "      <td>120</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Penelope</th>\n",
       "      <td>AL</td>\n",
       "      <td>white</td>\n",
       "      <td>Apple</td>\n",
       "      <td>4</td>\n",
       "      <td>80</td>\n",
       "      <td>3.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Dean</th>\n",
       "      <td>AK</td>\n",
       "      <td>gray</td>\n",
       "      <td>Cheese</td>\n",
       "      <td>32</td>\n",
       "      <td>180</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Christina</th>\n",
       "      <td>TX</td>\n",
       "      <td>black</td>\n",
       "      <td>Melon</td>\n",
       "      <td>33</td>\n",
       "      <td>172</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Cornelia</th>\n",
       "      <td>TX</td>\n",
       "      <td>red</td>\n",
       "      <td>Beans</td>\n",
       "      <td>69</td>\n",
       "      <td>150</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          state  color    food  age  height  score\n",
       "Jane         NY   blue   Steak   30     165    0.0\n",
       "Niko         TX  green    Lamb    2      70    8.3\n",
       "Aaron        FL    red   Mango   12     120    0.0\n",
       "Penelope     AL  white   Apple    4      80    3.3\n",
       "Dean         AK   gray  Cheese   32     180    0.0\n",
       "Christina    TX  black   Melon   33     172    0.0\n",
       "Cornelia     TX    red   Beans   69     150    0.0"
      ]
     },
     "execution_count": 44,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## What happened this time?\n",
    "The first part of the above operation selects the **`score`** column. When pandas selects a single column from a DataFrame, pandas creates a **view** and not a **copy**. A view just means that no new object has been created. **`df['score']`** references the **`score`** column in the original DataFrame.\n",
    "\n",
    "This is analogous to the list example where we assigned an entire list to a new variable. No new object is created, just a new reference to the one already in existence.\n",
    "\n",
    "Since no new data has been created, the assignment will modify the original DataFrame.\n",
    "\n",
    "## Why is a warning triggered when our operation completed successfully?\n",
    "Pandas does not know if you want to modify the original DataFrame or just the first subset selection.\n",
    "\n",
    "For instance, you could have selected the **`score`** column as a Series to do further analysis with it without affecting the original DataFrame.\n",
    "\n",
    "Let's get a fresh read of our data and see this example:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "df = pd.read_csv('../../data/sample_data.csv', index_col=0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Jane         4.6\n",
       "Niko         8.3\n",
       "Aaron        9.0\n",
       "Penelope     3.3\n",
       "Dean         1.8\n",
       "Christina    9.5\n",
       "Cornelia     2.2\n",
       "Name: score, dtype: float64"
      ]
     },
     "execution_count": 46,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "s = df['score']\n",
    "s"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's set all the values of scores that are greater than 5 to 0."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/Users/Ted/anaconda3/lib/python3.6/site-packages/ipykernel_launcher.py:1: SettingWithCopyWarning: \n",
      "A value is trying to be set on a copy of a slice from a DataFrame\n",
      "\n",
      "See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy\n",
      "  \"\"\"Entry point for launching an IPython kernel.\n"
     ]
    }
   ],
   "source": [
    "s[s > 5] = 0"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Why was the warning triggered here?\n",
    "This last assignment does not use chained indexing. But, the variable **`s`** was created from a subset selection of a DataFrame. So it's really not any different than doing the following:\n",
    "\n",
    "```Python\n",
    ">>> df['score'][df['score'] > 5] = 0\n",
    "```\n",
    "\n",
    "Pandas can't tell the difference between an assignment like this in a single line versus one on multiple lines. Pandas doesn't know if you want the original DataFrame modified or not. Let's take a look at both **`s`** and **`df`** to see what has happened."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 48,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Jane         4.6\n",
       "Niko         0.0\n",
       "Aaron        0.0\n",
       "Penelope     3.3\n",
       "Dean         1.8\n",
       "Christina    0.0\n",
       "Cornelia     2.2\n",
       "Name: score, dtype: float64"
      ]
     },
     "execution_count": 48,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "s"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**`s`** was modified as expected. But, what about our original?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>state</th>\n",
       "      <th>color</th>\n",
       "      <th>food</th>\n",
       "      <th>age</th>\n",
       "      <th>height</th>\n",
       "      <th>score</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Jane</th>\n",
       "      <td>NY</td>\n",
       "      <td>blue</td>\n",
       "      <td>Steak</td>\n",
       "      <td>30</td>\n",
       "      <td>165</td>\n",
       "      <td>4.6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Niko</th>\n",
       "      <td>TX</td>\n",
       "      <td>green</td>\n",
       "      <td>Lamb</td>\n",
       "      <td>2</td>\n",
       "      <td>70</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Aaron</th>\n",
       "      <td>FL</td>\n",
       "      <td>red</td>\n",
       "      <td>Mango</td>\n",
       "      <td>12</td>\n",
       "      <td>120</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Penelope</th>\n",
       "      <td>AL</td>\n",
       "      <td>white</td>\n",
       "      <td>Apple</td>\n",
       "      <td>4</td>\n",
       "      <td>80</td>\n",
       "      <td>3.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Dean</th>\n",
       "      <td>AK</td>\n",
       "      <td>gray</td>\n",
       "      <td>Cheese</td>\n",
       "      <td>32</td>\n",
       "      <td>180</td>\n",
       "      <td>1.8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Christina</th>\n",
       "      <td>TX</td>\n",
       "      <td>black</td>\n",
       "      <td>Melon</td>\n",
       "      <td>33</td>\n",
       "      <td>172</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Cornelia</th>\n",
       "      <td>TX</td>\n",
       "      <td>red</td>\n",
       "      <td>Beans</td>\n",
       "      <td>69</td>\n",
       "      <td>150</td>\n",
       "      <td>2.2</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          state  color    food  age  height  score\n",
       "Jane         NY   blue   Steak   30     165    4.6\n",
       "Niko         TX  green    Lamb    2      70    0.0\n",
       "Aaron        FL    red   Mango   12     120    0.0\n",
       "Penelope     AL  white   Apple    4      80    3.3\n",
       "Dean         AK   gray  Cheese   32     180    1.8\n",
       "Christina    TX  black   Melon   33     172    0.0\n",
       "Cornelia     TX    red   Beans   69     150    2.2"
      ]
     },
     "execution_count": 49,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Our original DataFrame has been modified, which means that **`s`** is a view and not a copy.\n",
    "\n",
    "## Why is the warning message so useless?\n",
    "Let's take a look at what the warning says:\n",
    "\n",
    "> **A value is trying to be set on a copy of a slice from a DataFrame**\n",
    "\n",
    "I'm not sure what **copy of a slice** actually means but it isn't what we had in the previous example. **`s`** was a view of a column of a DataFrame and not a copy.\n",
    "\n",
    "## What the warning should really say\n",
    "A better message would look something like this:\n",
    "\n",
    "> **You are attempting to make an assignment on an object that is either a view or a copy of a DataFrame. This occurs whenever you make a subset selection from a DataFrame and then try to assign new values to this subset.**\n",
    "\n",
    "## Summary of when the SettingWithCopy warning is triggered\n",
    "In summary, whenever you make a subset selection and then modify the values in that subset selection, you will likely trigger the **`SettingWithCopy`** warning.\n",
    "\n",
    "It might help to see one more example of when the **`SettingWithCopy`** is triggered. \n",
    "\n",
    "Let's begin by selecting two columns from **`df`** into a new variable:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 50,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>color</th>\n",
       "      <th>age</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Jane</th>\n",
       "      <td>blue</td>\n",
       "      <td>30</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Niko</th>\n",
       "      <td>green</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Aaron</th>\n",
       "      <td>red</td>\n",
       "      <td>12</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Penelope</th>\n",
       "      <td>white</td>\n",
       "      <td>4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Dean</th>\n",
       "      <td>gray</td>\n",
       "      <td>32</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Christina</th>\n",
       "      <td>black</td>\n",
       "      <td>33</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Cornelia</th>\n",
       "      <td>red</td>\n",
       "      <td>69</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "           color  age\n",
       "Jane        blue   30\n",
       "Niko       green    2\n",
       "Aaron        red   12\n",
       "Penelope   white    4\n",
       "Dean        gray   32\n",
       "Christina  black   33\n",
       "Cornelia     red   69"
      ]
     },
     "execution_count": 50,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df1 = df[['color', 'age']]\n",
    "df1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's display the **`age`** column from this new DataFrame:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 51,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Jane         30\n",
       "Niko          2\n",
       "Aaron        12\n",
       "Penelope      4\n",
       "Dean         32\n",
       "Christina    33\n",
       "Cornelia     69\n",
       "Name: age, dtype: int64"
      ]
     },
     "execution_count": 51,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df1['age'] # no warning for outputing, there is no assignment here"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's add a new column **`weight`**:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 52,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/Users/Ted/anaconda3/lib/python3.6/site-packages/ipykernel_launcher.py:1: SettingWithCopyWarning: \n",
      "A value is trying to be set on a copy of a slice from a DataFrame.\n",
      "Try using .loc[row_indexer,col_indexer] = value instead\n",
      "\n",
      "See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy\n",
      "  \"\"\"Entry point for launching an IPython kernel.\n"
     ]
    }
   ],
   "source": [
    "df1['weight'] = [150, 30, 120, 40, 200, 130, 144]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We triggered the warning because **`df1`** is a subset selection from **`df`** and we subsequently modified it by adding a new column."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>color</th>\n",
       "      <th>age</th>\n",
       "      <th>weight</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Jane</th>\n",
       "      <td>blue</td>\n",
       "      <td>30</td>\n",
       "      <td>150</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Niko</th>\n",
       "      <td>green</td>\n",
       "      <td>2</td>\n",
       "      <td>30</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Aaron</th>\n",
       "      <td>red</td>\n",
       "      <td>12</td>\n",
       "      <td>120</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Penelope</th>\n",
       "      <td>white</td>\n",
       "      <td>4</td>\n",
       "      <td>40</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Dean</th>\n",
       "      <td>gray</td>\n",
       "      <td>32</td>\n",
       "      <td>200</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Christina</th>\n",
       "      <td>black</td>\n",
       "      <td>33</td>\n",
       "      <td>130</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Cornelia</th>\n",
       "      <td>red</td>\n",
       "      <td>69</td>\n",
       "      <td>144</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "           color  age  weight\n",
       "Jane        blue   30     150\n",
       "Niko       green    2      30\n",
       "Aaron        red   12     120\n",
       "Penelope   white    4      40\n",
       "Dean        gray   32     200\n",
       "Christina  black   33     130\n",
       "Cornelia     red   69     144"
      ]
     },
     "execution_count": 53,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The original is left unchanged."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 54,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>state</th>\n",
       "      <th>color</th>\n",
       "      <th>food</th>\n",
       "      <th>age</th>\n",
       "      <th>height</th>\n",
       "      <th>score</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Jane</th>\n",
       "      <td>NY</td>\n",
       "      <td>blue</td>\n",
       "      <td>Steak</td>\n",
       "      <td>30</td>\n",
       "      <td>165</td>\n",
       "      <td>4.6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Niko</th>\n",
       "      <td>TX</td>\n",
       "      <td>green</td>\n",
       "      <td>Lamb</td>\n",
       "      <td>2</td>\n",
       "      <td>70</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Aaron</th>\n",
       "      <td>FL</td>\n",
       "      <td>red</td>\n",
       "      <td>Mango</td>\n",
       "      <td>12</td>\n",
       "      <td>120</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Penelope</th>\n",
       "      <td>AL</td>\n",
       "      <td>white</td>\n",
       "      <td>Apple</td>\n",
       "      <td>4</td>\n",
       "      <td>80</td>\n",
       "      <td>3.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Dean</th>\n",
       "      <td>AK</td>\n",
       "      <td>gray</td>\n",
       "      <td>Cheese</td>\n",
       "      <td>32</td>\n",
       "      <td>180</td>\n",
       "      <td>1.8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Christina</th>\n",
       "      <td>TX</td>\n",
       "      <td>black</td>\n",
       "      <td>Melon</td>\n",
       "      <td>33</td>\n",
       "      <td>172</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Cornelia</th>\n",
       "      <td>TX</td>\n",
       "      <td>red</td>\n",
       "      <td>Beans</td>\n",
       "      <td>69</td>\n",
       "      <td>150</td>\n",
       "      <td>2.2</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          state  color    food  age  height  score\n",
       "Jane         NY   blue   Steak   30     165    4.6\n",
       "Niko         TX  green    Lamb    2      70    0.0\n",
       "Aaron        FL    red   Mango   12     120    0.0\n",
       "Penelope     AL  white   Apple    4      80    3.3\n",
       "Dean         AK   gray  Cheese   32     180    1.8\n",
       "Christina    TX  black   Melon   33     172    0.0\n",
       "Cornelia     TX    red   Beans   69     150    2.2"
      ]
     },
     "execution_count": 54,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's continue and change all the ages to 99, which will again trigger the warning."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 55,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/Users/Ted/anaconda3/lib/python3.6/site-packages/ipykernel_launcher.py:1: SettingWithCopyWarning: \n",
      "A value is trying to be set on a copy of a slice from a DataFrame.\n",
      "Try using .loc[row_indexer,col_indexer] = value instead\n",
      "\n",
      "See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy\n",
      "  \"\"\"Entry point for launching an IPython kernel.\n"
     ]
    }
   ],
   "source": [
    "df1['age'] = 99"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 56,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>color</th>\n",
       "      <th>age</th>\n",
       "      <th>weight</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Jane</th>\n",
       "      <td>blue</td>\n",
       "      <td>99</td>\n",
       "      <td>150</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Niko</th>\n",
       "      <td>green</td>\n",
       "      <td>99</td>\n",
       "      <td>30</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Aaron</th>\n",
       "      <td>red</td>\n",
       "      <td>99</td>\n",
       "      <td>120</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Penelope</th>\n",
       "      <td>white</td>\n",
       "      <td>99</td>\n",
       "      <td>40</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Dean</th>\n",
       "      <td>gray</td>\n",
       "      <td>99</td>\n",
       "      <td>200</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Christina</th>\n",
       "      <td>black</td>\n",
       "      <td>99</td>\n",
       "      <td>130</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Cornelia</th>\n",
       "      <td>red</td>\n",
       "      <td>99</td>\n",
       "      <td>144</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "           color  age  weight\n",
       "Jane        blue   99     150\n",
       "Niko       green   99      30\n",
       "Aaron        red   99     120\n",
       "Penelope   white   99      40\n",
       "Dean        gray   99     200\n",
       "Christina  black   99     130\n",
       "Cornelia     red   99     144"
      ]
     },
     "execution_count": 56,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The original is also left unchanged:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 57,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>state</th>\n",
       "      <th>color</th>\n",
       "      <th>food</th>\n",
       "      <th>age</th>\n",
       "      <th>height</th>\n",
       "      <th>score</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Jane</th>\n",
       "      <td>NY</td>\n",
       "      <td>blue</td>\n",
       "      <td>Steak</td>\n",
       "      <td>30</td>\n",
       "      <td>165</td>\n",
       "      <td>4.6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Niko</th>\n",
       "      <td>TX</td>\n",
       "      <td>green</td>\n",
       "      <td>Lamb</td>\n",
       "      <td>2</td>\n",
       "      <td>70</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Aaron</th>\n",
       "      <td>FL</td>\n",
       "      <td>red</td>\n",
       "      <td>Mango</td>\n",
       "      <td>12</td>\n",
       "      <td>120</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Penelope</th>\n",
       "      <td>AL</td>\n",
       "      <td>white</td>\n",
       "      <td>Apple</td>\n",
       "      <td>4</td>\n",
       "      <td>80</td>\n",
       "      <td>3.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Dean</th>\n",
       "      <td>AK</td>\n",
       "      <td>gray</td>\n",
       "      <td>Cheese</td>\n",
       "      <td>32</td>\n",
       "      <td>180</td>\n",
       "      <td>1.8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Christina</th>\n",
       "      <td>TX</td>\n",
       "      <td>black</td>\n",
       "      <td>Melon</td>\n",
       "      <td>33</td>\n",
       "      <td>172</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Cornelia</th>\n",
       "      <td>TX</td>\n",
       "      <td>red</td>\n",
       "      <td>Beans</td>\n",
       "      <td>69</td>\n",
       "      <td>150</td>\n",
       "      <td>2.2</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          state  color    food  age  height  score\n",
       "Jane         NY   blue   Steak   30     165    4.6\n",
       "Niko         TX  green    Lamb    2      70    0.0\n",
       "Aaron        FL    red   Mango   12     120    0.0\n",
       "Penelope     AL  white   Apple    4      80    3.3\n",
       "Dean         AK   gray  Cheese   32     180    1.8\n",
       "Christina    TX  black   Melon   33     172    0.0\n",
       "Cornelia     TX    red   Beans   69     150    2.2"
      ]
     },
     "execution_count": 57,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## How does pandas know to even to trigger the warning?\n",
    "In the above example, we created **`df1`**, which when modifying the **`age`** column, triggered the warning. How did pandas know to do this? \n",
    "\n",
    "**`df1`** was created by **`df[['color', 'age']]`**. During this selection, pandas alters the **`is_copy`** or **`_is_view`** attributes.\n",
    "\n",
    "If we call **`is_copy`** like a method then we will get the object it was copied from if it is a copy or **`None`** will be returned. Let's see its value for **`df1`**:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 58,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>state</th>\n",
       "      <th>color</th>\n",
       "      <th>food</th>\n",
       "      <th>age</th>\n",
       "      <th>height</th>\n",
       "      <th>score</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Jane</th>\n",
       "      <td>NY</td>\n",
       "      <td>blue</td>\n",
       "      <td>Steak</td>\n",
       "      <td>30</td>\n",
       "      <td>165</td>\n",
       "      <td>4.6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Niko</th>\n",
       "      <td>TX</td>\n",
       "      <td>green</td>\n",
       "      <td>Lamb</td>\n",
       "      <td>2</td>\n",
       "      <td>70</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Aaron</th>\n",
       "      <td>FL</td>\n",
       "      <td>red</td>\n",
       "      <td>Mango</td>\n",
       "      <td>12</td>\n",
       "      <td>120</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Penelope</th>\n",
       "      <td>AL</td>\n",
       "      <td>white</td>\n",
       "      <td>Apple</td>\n",
       "      <td>4</td>\n",
       "      <td>80</td>\n",
       "      <td>3.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Dean</th>\n",
       "      <td>AK</td>\n",
       "      <td>gray</td>\n",
       "      <td>Cheese</td>\n",
       "      <td>32</td>\n",
       "      <td>180</td>\n",
       "      <td>1.8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Christina</th>\n",
       "      <td>TX</td>\n",
       "      <td>black</td>\n",
       "      <td>Melon</td>\n",
       "      <td>33</td>\n",
       "      <td>172</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Cornelia</th>\n",
       "      <td>TX</td>\n",
       "      <td>red</td>\n",
       "      <td>Beans</td>\n",
       "      <td>69</td>\n",
       "      <td>150</td>\n",
       "      <td>2.2</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          state  color    food  age  height  score\n",
       "Jane         NY   blue   Steak   30     165    4.6\n",
       "Niko         TX  green    Lamb    2      70    0.0\n",
       "Aaron        FL    red   Mango   12     120    0.0\n",
       "Penelope     AL  white   Apple    4      80    3.3\n",
       "Dean         AK   gray  Cheese   32     180    1.8\n",
       "Christina    TX  black   Melon   33     172    0.0\n",
       "Cornelia     TX    red   Beans   69     150    2.2"
      ]
     },
     "execution_count": 58,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df1.is_copy()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The private attribute **`_is_view`** is a boolean:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 59,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "False"
      ]
     },
     "execution_count": 59,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df1._is_view"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's check these same attributes for **`df`**. Since **`df`** was read in directly from a csv it should not be a copy or a view."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 60,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 60,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.is_copy is None"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 61,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "False"
      ]
     },
     "execution_count": 61,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df._is_view"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's find out if a single column as a Series is a view or a copy."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 62,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 62,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "food = df['food']\n",
    "food.is_copy is None # not a copy"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 63,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 63,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "food._is_view"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Selecting a single column returns a view and not a copy"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## False Negatives with SettingWithCopy with .loc and .iloc\n",
    "Unfortunately, when using chained indexing where **`.loc`** and **`.iloc`** are used as the first indexer, the warning will not get triggered reliably. Let's take a look at an example where no warning is triggered and no change is made to the data.\n",
    "\n",
    "Let's change the ages of **`Niko`** and **`Dean`** to 99."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 64,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "df = pd.read_csv('../../data/sample_data.csv', index_col=0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 65,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>state</th>\n",
       "      <th>color</th>\n",
       "      <th>food</th>\n",
       "      <th>age</th>\n",
       "      <th>height</th>\n",
       "      <th>score</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Jane</th>\n",
       "      <td>NY</td>\n",
       "      <td>blue</td>\n",
       "      <td>Steak</td>\n",
       "      <td>30</td>\n",
       "      <td>165</td>\n",
       "      <td>4.6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Niko</th>\n",
       "      <td>TX</td>\n",
       "      <td>green</td>\n",
       "      <td>Lamb</td>\n",
       "      <td>2</td>\n",
       "      <td>70</td>\n",
       "      <td>8.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Aaron</th>\n",
       "      <td>FL</td>\n",
       "      <td>red</td>\n",
       "      <td>Mango</td>\n",
       "      <td>12</td>\n",
       "      <td>120</td>\n",
       "      <td>9.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Penelope</th>\n",
       "      <td>AL</td>\n",
       "      <td>white</td>\n",
       "      <td>Apple</td>\n",
       "      <td>4</td>\n",
       "      <td>80</td>\n",
       "      <td>3.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Dean</th>\n",
       "      <td>AK</td>\n",
       "      <td>gray</td>\n",
       "      <td>Cheese</td>\n",
       "      <td>32</td>\n",
       "      <td>180</td>\n",
       "      <td>1.8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Christina</th>\n",
       "      <td>TX</td>\n",
       "      <td>black</td>\n",
       "      <td>Melon</td>\n",
       "      <td>33</td>\n",
       "      <td>172</td>\n",
       "      <td>9.5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Cornelia</th>\n",
       "      <td>TX</td>\n",
       "      <td>red</td>\n",
       "      <td>Beans</td>\n",
       "      <td>69</td>\n",
       "      <td>150</td>\n",
       "      <td>2.2</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          state  color    food  age  height  score\n",
       "Jane         NY   blue   Steak   30     165    4.6\n",
       "Niko         TX  green    Lamb    2      70    8.3\n",
       "Aaron        FL    red   Mango   12     120    9.0\n",
       "Penelope     AL  white   Apple    4      80    3.3\n",
       "Dean         AK   gray  Cheese   32     180    1.8\n",
       "Christina    TX  black   Melon   33     172    9.5\n",
       "Cornelia     TX    red   Beans   69     150    2.2"
      ]
     },
     "execution_count": 65,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.loc[['Niko','Dean']]['age'] = 99\n",
    "df"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's make a slight change and use slice notation to select all the names from **Niko** through **Dean** instead."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 66,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/Users/Ted/anaconda3/lib/python3.6/site-packages/ipykernel_launcher.py:1: SettingWithCopyWarning: \n",
      "A value is trying to be set on a copy of a slice from a DataFrame.\n",
      "Try using .loc[row_indexer,col_indexer] = value instead\n",
      "\n",
      "See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy\n",
      "  \"\"\"Entry point for launching an IPython kernel.\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>state</th>\n",
       "      <th>color</th>\n",
       "      <th>food</th>\n",
       "      <th>age</th>\n",
       "      <th>height</th>\n",
       "      <th>score</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Jane</th>\n",
       "      <td>NY</td>\n",
       "      <td>blue</td>\n",
       "      <td>Steak</td>\n",
       "      <td>30</td>\n",
       "      <td>165</td>\n",
       "      <td>4.6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Niko</th>\n",
       "      <td>TX</td>\n",
       "      <td>green</td>\n",
       "      <td>Lamb</td>\n",
       "      <td>99</td>\n",
       "      <td>70</td>\n",
       "      <td>8.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Aaron</th>\n",
       "      <td>FL</td>\n",
       "      <td>red</td>\n",
       "      <td>Mango</td>\n",
       "      <td>99</td>\n",
       "      <td>120</td>\n",
       "      <td>9.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Penelope</th>\n",
       "      <td>AL</td>\n",
       "      <td>white</td>\n",
       "      <td>Apple</td>\n",
       "      <td>99</td>\n",
       "      <td>80</td>\n",
       "      <td>3.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Dean</th>\n",
       "      <td>AK</td>\n",
       "      <td>gray</td>\n",
       "      <td>Cheese</td>\n",
       "      <td>99</td>\n",
       "      <td>180</td>\n",
       "      <td>1.8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Christina</th>\n",
       "      <td>TX</td>\n",
       "      <td>black</td>\n",
       "      <td>Melon</td>\n",
       "      <td>33</td>\n",
       "      <td>172</td>\n",
       "      <td>9.5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Cornelia</th>\n",
       "      <td>TX</td>\n",
       "      <td>red</td>\n",
       "      <td>Beans</td>\n",
       "      <td>69</td>\n",
       "      <td>150</td>\n",
       "      <td>2.2</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          state  color    food  age  height  score\n",
       "Jane         NY   blue   Steak   30     165    4.6\n",
       "Niko         TX  green    Lamb   99      70    8.3\n",
       "Aaron        FL    red   Mango   99     120    9.0\n",
       "Penelope     AL  white   Apple   99      80    3.3\n",
       "Dean         AK   gray  Cheese   99     180    1.8\n",
       "Christina    TX  black   Melon   33     172    9.5\n",
       "Cornelia     TX    red   Beans   69     150    2.2"
      ]
     },
     "execution_count": 66,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.loc['Niko':'Dean']['age'] = 99\n",
    "df"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## WTF????\n",
    "By changing from a list to a slice within **`.loc`**, the warning is triggered AND the DataFrame is modified. This is craziness.\n",
    "\n",
    "## Some good news\n",
    "None of the stuff that we have done for **`SettingWithCopy`** needs to be memorized. Even I don't know whether a subset selection will return a view or a copy. You don't have to worry about any of that it.\n",
    "\n",
    "## Two common scenarios\n",
    "You will almost always find yourself in one of two scenarios:\n",
    "\n",
    "1. You want to work with the entire DataFrame and modify a subset of it\n",
    "1. You want to work with a subset of your original DataFrame and modify that subset\n",
    "\n",
    "Scenario 1 is solved by using exactly one indexer to make the selection and assignment.\n",
    "\n",
    "Scenario 2 is solved by forcing a copy of your subset selection with the **`copy`** method. This will allow you to make changes to this new DataFrame without modifying the original.\n",
    "\n",
    "## Scenario 1 - Working with the entire DataFrame\n",
    "When you are doing an analysis on a single DataFrame and want to work only on this DataFrame in its entirety then you are in scenario 1.\n",
    "\n",
    "For instance, if we want to change the **`color`** of all the people who live in Texas to maroon, we would do this in a single call to the **`.loc`** indexer."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 67,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>state</th>\n",
       "      <th>color</th>\n",
       "      <th>food</th>\n",
       "      <th>age</th>\n",
       "      <th>height</th>\n",
       "      <th>score</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Jane</th>\n",
       "      <td>NY</td>\n",
       "      <td>blue</td>\n",
       "      <td>Steak</td>\n",
       "      <td>30</td>\n",
       "      <td>165</td>\n",
       "      <td>4.6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Niko</th>\n",
       "      <td>TX</td>\n",
       "      <td>maroon</td>\n",
       "      <td>Lamb</td>\n",
       "      <td>2</td>\n",
       "      <td>70</td>\n",
       "      <td>8.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Aaron</th>\n",
       "      <td>FL</td>\n",
       "      <td>red</td>\n",
       "      <td>Mango</td>\n",
       "      <td>12</td>\n",
       "      <td>120</td>\n",
       "      <td>9.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Penelope</th>\n",
       "      <td>AL</td>\n",
       "      <td>white</td>\n",
       "      <td>Apple</td>\n",
       "      <td>4</td>\n",
       "      <td>80</td>\n",
       "      <td>3.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Dean</th>\n",
       "      <td>AK</td>\n",
       "      <td>gray</td>\n",
       "      <td>Cheese</td>\n",
       "      <td>32</td>\n",
       "      <td>180</td>\n",
       "      <td>1.8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Christina</th>\n",
       "      <td>TX</td>\n",
       "      <td>maroon</td>\n",
       "      <td>Melon</td>\n",
       "      <td>33</td>\n",
       "      <td>172</td>\n",
       "      <td>9.5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Cornelia</th>\n",
       "      <td>TX</td>\n",
       "      <td>maroon</td>\n",
       "      <td>Beans</td>\n",
       "      <td>69</td>\n",
       "      <td>150</td>\n",
       "      <td>2.2</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          state   color    food  age  height  score\n",
       "Jane         NY    blue   Steak   30     165    4.6\n",
       "Niko         TX  maroon    Lamb    2      70    8.3\n",
       "Aaron        FL     red   Mango   12     120    9.0\n",
       "Penelope     AL   white   Apple    4      80    3.3\n",
       "Dean         AK    gray  Cheese   32     180    1.8\n",
       "Christina    TX  maroon   Melon   33     172    9.5\n",
       "Cornelia     TX  maroon   Beans   69     150    2.2"
      ]
     },
     "execution_count": 67,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df = pd.read_csv('../../data/sample_data.csv', index_col=0)\n",
    "\n",
    "df.loc[df['state'] == 'TX', 'color'] = 'maroon'\n",
    "df"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We add 5 to the score of **`Jane`**, **`Dean`**, and **`Cornelia`** like this:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 68,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>state</th>\n",
       "      <th>color</th>\n",
       "      <th>food</th>\n",
       "      <th>age</th>\n",
       "      <th>height</th>\n",
       "      <th>score</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Jane</th>\n",
       "      <td>NY</td>\n",
       "      <td>blue</td>\n",
       "      <td>Steak</td>\n",
       "      <td>30</td>\n",
       "      <td>165</td>\n",
       "      <td>9.6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Niko</th>\n",
       "      <td>TX</td>\n",
       "      <td>maroon</td>\n",
       "      <td>Lamb</td>\n",
       "      <td>2</td>\n",
       "      <td>70</td>\n",
       "      <td>8.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Aaron</th>\n",
       "      <td>FL</td>\n",
       "      <td>red</td>\n",
       "      <td>Mango</td>\n",
       "      <td>12</td>\n",
       "      <td>120</td>\n",
       "      <td>9.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Penelope</th>\n",
       "      <td>AL</td>\n",
       "      <td>white</td>\n",
       "      <td>Apple</td>\n",
       "      <td>4</td>\n",
       "      <td>80</td>\n",
       "      <td>3.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Dean</th>\n",
       "      <td>AK</td>\n",
       "      <td>gray</td>\n",
       "      <td>Cheese</td>\n",
       "      <td>32</td>\n",
       "      <td>180</td>\n",
       "      <td>6.8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Christina</th>\n",
       "      <td>TX</td>\n",
       "      <td>maroon</td>\n",
       "      <td>Melon</td>\n",
       "      <td>33</td>\n",
       "      <td>172</td>\n",
       "      <td>9.5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Cornelia</th>\n",
       "      <td>TX</td>\n",
       "      <td>maroon</td>\n",
       "      <td>Beans</td>\n",
       "      <td>69</td>\n",
       "      <td>150</td>\n",
       "      <td>7.2</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          state   color    food  age  height  score\n",
       "Jane         NY    blue   Steak   30     165    9.6\n",
       "Niko         TX  maroon    Lamb    2      70    8.3\n",
       "Aaron        FL     red   Mango   12     120    9.0\n",
       "Penelope     AL   white   Apple    4      80    3.3\n",
       "Dean         AK    gray  Cheese   32     180    6.8\n",
       "Christina    TX  maroon   Melon   33     172    9.5\n",
       "Cornelia     TX  maroon   Beans   69     150    7.2"
      ]
     },
     "execution_count": 68,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.loc[['Jane', 'Dean', 'Cornelia'], 'score'] = df.loc[['Jane', 'Dean', 'Cornelia'], 'score'] + 5\n",
    "df"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can change a single cell, such as the age of **`Aaron`** to 15:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 69,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>state</th>\n",
       "      <th>color</th>\n",
       "      <th>food</th>\n",
       "      <th>age</th>\n",
       "      <th>height</th>\n",
       "      <th>score</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Jane</th>\n",
       "      <td>NY</td>\n",
       "      <td>blue</td>\n",
       "      <td>Steak</td>\n",
       "      <td>30</td>\n",
       "      <td>165</td>\n",
       "      <td>9.6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Niko</th>\n",
       "      <td>TX</td>\n",
       "      <td>maroon</td>\n",
       "      <td>Lamb</td>\n",
       "      <td>2</td>\n",
       "      <td>70</td>\n",
       "      <td>8.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Aaron</th>\n",
       "      <td>FL</td>\n",
       "      <td>red</td>\n",
       "      <td>Mango</td>\n",
       "      <td>15</td>\n",
       "      <td>120</td>\n",
       "      <td>9.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Penelope</th>\n",
       "      <td>AL</td>\n",
       "      <td>white</td>\n",
       "      <td>Apple</td>\n",
       "      <td>4</td>\n",
       "      <td>80</td>\n",
       "      <td>3.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Dean</th>\n",
       "      <td>AK</td>\n",
       "      <td>gray</td>\n",
       "      <td>Cheese</td>\n",
       "      <td>32</td>\n",
       "      <td>180</td>\n",
       "      <td>6.8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Christina</th>\n",
       "      <td>TX</td>\n",
       "      <td>maroon</td>\n",
       "      <td>Melon</td>\n",
       "      <td>33</td>\n",
       "      <td>172</td>\n",
       "      <td>9.5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Cornelia</th>\n",
       "      <td>TX</td>\n",
       "      <td>maroon</td>\n",
       "      <td>Beans</td>\n",
       "      <td>69</td>\n",
       "      <td>150</td>\n",
       "      <td>7.2</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          state   color    food  age  height  score\n",
       "Jane         NY    blue   Steak   30     165    9.6\n",
       "Niko         TX  maroon    Lamb    2      70    8.3\n",
       "Aaron        FL     red   Mango   15     120    9.0\n",
       "Penelope     AL   white   Apple    4      80    3.3\n",
       "Dean         AK    gray  Cheese   32     180    6.8\n",
       "Christina    TX  maroon   Melon   33     172    9.5\n",
       "Cornelia     TX  maroon   Beans   69     150    7.2"
      ]
     },
     "execution_count": 69,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.loc['Aaron', 'age'] = 15\n",
    "df"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "All of these examples involve simultaneous selection of rows and columns, which are the most tempting to do chained indexing on."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Scenario 2\n",
    "In scenario 2, we would like to select some data from our original DataFrame and do an independent analysis on it separate from the original. For instance, let's say we wanted to select just the **`food`** and **`height`** columns into a separate DataFrame.\n",
    "\n",
    "Let's go ahead and make this selection:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 70,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>food</th>\n",
       "      <th>height</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Jane</th>\n",
       "      <td>Steak</td>\n",
       "      <td>165</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Niko</th>\n",
       "      <td>Lamb</td>\n",
       "      <td>70</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Aaron</th>\n",
       "      <td>Mango</td>\n",
       "      <td>120</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Penelope</th>\n",
       "      <td>Apple</td>\n",
       "      <td>80</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Dean</th>\n",
       "      <td>Cheese</td>\n",
       "      <td>180</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Christina</th>\n",
       "      <td>Melon</td>\n",
       "      <td>172</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Cornelia</th>\n",
       "      <td>Beans</td>\n",
       "      <td>150</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "             food  height\n",
       "Jane        Steak     165\n",
       "Niko         Lamb      70\n",
       "Aaron       Mango     120\n",
       "Penelope    Apple      80\n",
       "Dean       Cheese     180\n",
       "Christina   Melon     172\n",
       "Cornelia    Beans     150"
      ]
     },
     "execution_count": 70,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df1 = df[['food', 'height']]\n",
    "df1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As we saw from above, this DataFrame is still linked to the original DataFrame. We can check the **`is_copy`** attribute."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 71,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>state</th>\n",
       "      <th>color</th>\n",
       "      <th>food</th>\n",
       "      <th>age</th>\n",
       "      <th>height</th>\n",
       "      <th>score</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Jane</th>\n",
       "      <td>NY</td>\n",
       "      <td>blue</td>\n",
       "      <td>Steak</td>\n",
       "      <td>30</td>\n",
       "      <td>165</td>\n",
       "      <td>9.6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Niko</th>\n",
       "      <td>TX</td>\n",
       "      <td>maroon</td>\n",
       "      <td>Lamb</td>\n",
       "      <td>2</td>\n",
       "      <td>70</td>\n",
       "      <td>8.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Aaron</th>\n",
       "      <td>FL</td>\n",
       "      <td>red</td>\n",
       "      <td>Mango</td>\n",
       "      <td>15</td>\n",
       "      <td>120</td>\n",
       "      <td>9.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Penelope</th>\n",
       "      <td>AL</td>\n",
       "      <td>white</td>\n",
       "      <td>Apple</td>\n",
       "      <td>4</td>\n",
       "      <td>80</td>\n",
       "      <td>3.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Dean</th>\n",
       "      <td>AK</td>\n",
       "      <td>gray</td>\n",
       "      <td>Cheese</td>\n",
       "      <td>32</td>\n",
       "      <td>180</td>\n",
       "      <td>6.8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Christina</th>\n",
       "      <td>TX</td>\n",
       "      <td>maroon</td>\n",
       "      <td>Melon</td>\n",
       "      <td>33</td>\n",
       "      <td>172</td>\n",
       "      <td>9.5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Cornelia</th>\n",
       "      <td>TX</td>\n",
       "      <td>maroon</td>\n",
       "      <td>Beans</td>\n",
       "      <td>69</td>\n",
       "      <td>150</td>\n",
       "      <td>7.2</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          state   color    food  age  height  score\n",
       "Jane         NY    blue   Steak   30     165    9.6\n",
       "Niko         TX  maroon    Lamb    2      70    8.3\n",
       "Aaron        FL     red   Mango   15     120    9.0\n",
       "Penelope     AL   white   Apple    4      80    3.3\n",
       "Dean         AK    gray  Cheese   32     180    6.8\n",
       "Christina    TX  maroon   Melon   33     172    9.5\n",
       "Cornelia     TX  maroon   Beans   69     150    7.2"
      ]
     },
     "execution_count": 71,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df1.is_copy()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Use the copy method\n",
    "To get an independent copy, call the **`copy`** method like this:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 72,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 72,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df1 = df[['food', 'height']].copy()\n",
    "df1.is_copy is None"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**`df1`** is no longer linked to the original in any way. We can now modify one of its columns without getting the **`SettingWithCopy`** warning.\n",
    "\n",
    "Let's change the height of every person to 100:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 73,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>food</th>\n",
       "      <th>height</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Jane</th>\n",
       "      <td>Steak</td>\n",
       "      <td>100</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Niko</th>\n",
       "      <td>Lamb</td>\n",
       "      <td>100</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Aaron</th>\n",
       "      <td>Mango</td>\n",
       "      <td>100</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Penelope</th>\n",
       "      <td>Apple</td>\n",
       "      <td>100</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Dean</th>\n",
       "      <td>Cheese</td>\n",
       "      <td>100</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Christina</th>\n",
       "      <td>Melon</td>\n",
       "      <td>100</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Cornelia</th>\n",
       "      <td>Beans</td>\n",
       "      <td>100</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "             food  height\n",
       "Jane        Steak     100\n",
       "Niko         Lamb     100\n",
       "Aaron       Mango     100\n",
       "Penelope    Apple     100\n",
       "Dean       Cheese     100\n",
       "Christina   Melon     100\n",
       "Cornelia    Beans     100"
      ]
     },
     "execution_count": 73,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df1['height'] = 100\n",
    "df1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Avoiding ambiguity and complexity\n",
    "We will now move away from the **`SettingWithCopy`** warning and cover some subset selections that I tend to avoid. These subset selections that I personally find either ambiguous or adding complexity to pandas without adding any additional functionality. \n",
    "\n",
    "However, this does not mean that these subset selections are wrong. They were built into pandas for a purpose and many others make use of them and don't have a problem using them. So, it will be up to you whether or not you decide to use these upcoming subset selections.\n",
    "\n",
    "## Selecting rows with *just the indexing operator*\n",
    "The primary purpose of *just the indexing operator* is to select column(s) by passing it a string or list of strings. Unexpectedly, this operator completely changes behavior whenever you pass it a slice. For instance, we can select every other row beginning with integer location 1 to the end like this:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 74,
   "metadata": {},
   "outputs": [],
   "source": [
    "df = pd.read_csv('../../data/sample_data.csv', index_col=0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 75,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>state</th>\n",
       "      <th>color</th>\n",
       "      <th>food</th>\n",
       "      <th>age</th>\n",
       "      <th>height</th>\n",
       "      <th>score</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Niko</th>\n",
       "      <td>TX</td>\n",
       "      <td>green</td>\n",
       "      <td>Lamb</td>\n",
       "      <td>2</td>\n",
       "      <td>70</td>\n",
       "      <td>8.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Penelope</th>\n",
       "      <td>AL</td>\n",
       "      <td>white</td>\n",
       "      <td>Apple</td>\n",
       "      <td>4</td>\n",
       "      <td>80</td>\n",
       "      <td>3.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Christina</th>\n",
       "      <td>TX</td>\n",
       "      <td>black</td>\n",
       "      <td>Melon</td>\n",
       "      <td>33</td>\n",
       "      <td>172</td>\n",
       "      <td>9.5</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          state  color   food  age  height  score\n",
       "Niko         TX  green   Lamb    2      70    8.3\n",
       "Penelope     AL  white  Apple    4      80    3.3\n",
       "Christina    TX  black  Melon   33     172    9.5"
      ]
     },
     "execution_count": 75,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df[1::2]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Even stranger, is that you can make selections by row label as well. For instance, we can select **`Niko`** through **`Dean`** like this:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 76,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>state</th>\n",
       "      <th>color</th>\n",
       "      <th>food</th>\n",
       "      <th>age</th>\n",
       "      <th>height</th>\n",
       "      <th>score</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Niko</th>\n",
       "      <td>TX</td>\n",
       "      <td>green</td>\n",
       "      <td>Lamb</td>\n",
       "      <td>2</td>\n",
       "      <td>70</td>\n",
       "      <td>8.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Aaron</th>\n",
       "      <td>FL</td>\n",
       "      <td>red</td>\n",
       "      <td>Mango</td>\n",
       "      <td>12</td>\n",
       "      <td>120</td>\n",
       "      <td>9.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Penelope</th>\n",
       "      <td>AL</td>\n",
       "      <td>white</td>\n",
       "      <td>Apple</td>\n",
       "      <td>4</td>\n",
       "      <td>80</td>\n",
       "      <td>3.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Dean</th>\n",
       "      <td>AK</td>\n",
       "      <td>gray</td>\n",
       "      <td>Cheese</td>\n",
       "      <td>32</td>\n",
       "      <td>180</td>\n",
       "      <td>1.8</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "         state  color    food  age  height  score\n",
       "Niko        TX  green    Lamb    2      70    8.3\n",
       "Aaron       FL    red   Mango   12     120    9.0\n",
       "Penelope    AL  white   Apple    4      80    3.3\n",
       "Dean        AK   gray  Cheese   32     180    1.8"
      ]
     },
     "execution_count": 76,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df['Niko':'Dean']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Even more bizarre....partial string subset selection\n",
    "The most bizarre thing is that you can use partial string matches on the index when using a slice with *just the indexing operator*. For this to work, the index will need to be sorted. Call the **`sort_index`** method to do so:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 77,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>state</th>\n",
       "      <th>color</th>\n",
       "      <th>food</th>\n",
       "      <th>age</th>\n",
       "      <th>height</th>\n",
       "      <th>score</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Aaron</th>\n",
       "      <td>FL</td>\n",
       "      <td>red</td>\n",
       "      <td>Mango</td>\n",
       "      <td>12</td>\n",
       "      <td>120</td>\n",
       "      <td>9.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Christina</th>\n",
       "      <td>TX</td>\n",
       "      <td>black</td>\n",
       "      <td>Melon</td>\n",
       "      <td>33</td>\n",
       "      <td>172</td>\n",
       "      <td>9.5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Cornelia</th>\n",
       "      <td>TX</td>\n",
       "      <td>red</td>\n",
       "      <td>Beans</td>\n",
       "      <td>69</td>\n",
       "      <td>150</td>\n",
       "      <td>2.2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Dean</th>\n",
       "      <td>AK</td>\n",
       "      <td>gray</td>\n",
       "      <td>Cheese</td>\n",
       "      <td>32</td>\n",
       "      <td>180</td>\n",
       "      <td>1.8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Jane</th>\n",
       "      <td>NY</td>\n",
       "      <td>blue</td>\n",
       "      <td>Steak</td>\n",
       "      <td>30</td>\n",
       "      <td>165</td>\n",
       "      <td>4.6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Niko</th>\n",
       "      <td>TX</td>\n",
       "      <td>green</td>\n",
       "      <td>Lamb</td>\n",
       "      <td>2</td>\n",
       "      <td>70</td>\n",
       "      <td>8.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Penelope</th>\n",
       "      <td>AL</td>\n",
       "      <td>white</td>\n",
       "      <td>Apple</td>\n",
       "      <td>4</td>\n",
       "      <td>80</td>\n",
       "      <td>3.3</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          state  color    food  age  height  score\n",
       "Aaron        FL    red   Mango   12     120    9.0\n",
       "Christina    TX  black   Melon   33     172    9.5\n",
       "Cornelia     TX    red   Beans   69     150    2.2\n",
       "Dean         AK   gray  Cheese   32     180    1.8\n",
       "Jane         NY   blue   Steak   30     165    4.6\n",
       "Niko         TX  green    Lamb    2      70    8.3\n",
       "Penelope     AL  white   Apple    4      80    3.3"
      ]
     },
     "execution_count": 77,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_sort = df.sort_index()\n",
    "df_sort"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, you can do use slice notation with partial strings. For instance, if we wanted to select names that begin with 'C' and 'D', we would do the following:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 78,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>state</th>\n",
       "      <th>color</th>\n",
       "      <th>food</th>\n",
       "      <th>age</th>\n",
       "      <th>height</th>\n",
       "      <th>score</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Christina</th>\n",
       "      <td>TX</td>\n",
       "      <td>black</td>\n",
       "      <td>Melon</td>\n",
       "      <td>33</td>\n",
       "      <td>172</td>\n",
       "      <td>9.5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Cornelia</th>\n",
       "      <td>TX</td>\n",
       "      <td>red</td>\n",
       "      <td>Beans</td>\n",
       "      <td>69</td>\n",
       "      <td>150</td>\n",
       "      <td>2.2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Dean</th>\n",
       "      <td>AK</td>\n",
       "      <td>gray</td>\n",
       "      <td>Cheese</td>\n",
       "      <td>32</td>\n",
       "      <td>180</td>\n",
       "      <td>1.8</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          state  color    food  age  height  score\n",
       "Christina    TX  black   Melon   33     172    9.5\n",
       "Cornelia     TX    red   Beans   69     150    2.2\n",
       "Dean         AK   gray  Cheese   32     180    1.8"
      ]
     },
     "execution_count": 78,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_sort['C':'E']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Technically this slices from the exact label 'C' to the exact label 'E'.\n",
    "\n",
    "## I never do this\n",
    "I never pass *just the indexing operator* slices like in these examples for the following reasons:\n",
    "* It is confusing for there to be multiple modes of operation for an operator - one for selecting columns and the other for selecting rows\n",
    "* It is not explicit - slicing can happen by both integer location and by label.\n",
    "\n",
    "## If I want to slice rows, I always use .loc/.iloc\n",
    "The **`.loc/.iloc`** indexers are explicit and there will be no ambiguity with them. This follows from the Zen of Python that explicit is better than implicit.\n",
    "\n",
    "## Scalar selection with `.at\\.iat`\n",
    "Both **`.at\\.iat`** are indexers that pandas has available to make a selection of one and only one single value. Each of these indexers works analogously to **`.loc\\.iloc`**.\n",
    "\n",
    "**`.at`** makes its selection only by **label** and **`.iat`** selects only by integer location. They each select a single value. The term **scalar** is used to refer to a single value.\n",
    "\n",
    "Let's take a look at an example of each. First, let's select Dean's age."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 79,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "32"
      ]
     },
     "execution_count": 79,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.at['Dean', 'age']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Notice that a scalar value was returned and not a pandas Series or DataFrame.\n",
    "\n",
    "We can select the cell at the 5th row and 2nd column like this:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 80,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'Melon'"
      ]
     },
     "execution_count": 80,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.iat[5, 2]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## What is the purpose of `.at\\.iat`?\n",
    "These indexers offer no additional functionality. You can select a single value with **`.loc`** or **`.iloc`**. But, what they do offer, is a performance improvement, albeit a minor one. So, if there is a performance critical part of your code that does lots of scalar selections you could use **`.at\\.iat`**.\n",
    "\n",
    "Personally, I never use them as they add needless complexity to the library for a small performance gain."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Summary\n",
    "* Idiomatic pandas is efficient, easy to read, and common among experts\n",
    "* It is easy to write non-idiomatic pandas - lots of ways to do the same thing\n",
    "* Getting the right answer does not guarantee that you are using pandas correctly\n",
    "* Assignment with chained indexing with a list does not work: **`a[2:6][0] = 5`** - NEVER DO THIS!\n",
    "* Assigning an entire list to a new variable does NOT create a new copy. Both variable names will refer to the same underlying object\n",
    "* Slicing a list creates a shallow copy\n",
    "* To make new copies of any mutable objects within a list, you need to do a deep copy\n",
    "* Chained indexing happens when you make two successive subset selections one directly after the other\n",
    "* Avoid chained indexing in pandas\n",
    "* Identify chained indexing - closed then open brackets **`][`** or **`.loc\\.iloc`** following a bracket like this: **`df[a].loc[b]`**\n",
    "* Another way to identify chained indexing is if you can break up the indexing into 2 lines of code.\n",
    "* Chained indexing is bad because it uses two calls to the indexers instead of one and more importantly triggers the **`SettingWithCopy`** warning when doing an assignment\n",
    "* The **`SettingWithCopy`** warning gets triggered when you make a subset selection and then try to assign new values within this selection. i.e. you have done chained indexing!\n",
    "* Chained indexing is the cause of the **`SettingWithCopy`** warning. Chained indexing can happen in the same line, consecutive lines, or two lines very far apart from each other.\n",
    "* Whenever you make a subset selection, pandas creates either a **view** or a **copy**\n",
    "* A **view** is a reference to the original DataFrame. No new data has been created.\n",
    "* A **copy** means a completely new object with new data that is unlinked from the original DataFrame\n",
    "* The reason the **`SettingWithCopy`** warning exists, is because you might be trying to make an assignment that actually fails or you might be modifying the original DataFrame without knowing it.\n",
    "* Regardless of why the **`SettingWithCopy`** warning was triggered, you should not ignore it. \n",
    "* You need to go back and understand what happened and probably rewrite your pandas so you don't trigger the warning\n",
    "* One of the most common triggers of the **`SettingWithCopy`** warning is when you do boolean selection and then try to set the values of a column like this: **`df[df['col a'] > 5]['col b'] = 10`**\n",
    "* Fix this chained assignment with **`.loc`** or **`.iloc`** like this: **`df.loc[df['col a'] > 5, 'col b'] = 10`**\n",
    "* You can use **`is_copy`** or the private attribute **`_is_view`** to determine if you have a view or a copy\n",
    "* It is not necessary to know whether you have a view or a copy to write good pandas, because you will likely be in one of two scenarios.\n",
    "* In Scenario 1, you will be working on one DataFrame and want to change values in it and continue to use it in its entirety\n",
    "* In Scenario 2, you will make a subset selection, store it to a variable, and want to continue working on it independently from the original data.\n",
    "* To avoid the **`SettingWithCopy`** warning, use **`.loc\\.iloc`** for scenario 1 and the **`copy`** method with scenario 2.\n",
    "* Personally, I avoid using *just the indexing operator* to select rows with slices because it is ambiguous.\n",
    "* I also avoid using **`.at\\.iat`** because it adds unneeded complexity and adds no increased functionality."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Exercises"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Exercise 1\n",
    "<span  style=\"color:green; font-size:16px\"> Create a ten item list and then use chained indexing to select the items from 2 to the end and then from this subset, the first three items.</span>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 81,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Your code here"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Exercise 2\n",
    "<span  style=\"color:green; font-size:16px\"> Get the same result from example 1 with just a single call to the indexing operator.</span>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 82,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Your code here"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Use the following DataFrame for the next several questions"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 83,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>state</th>\n",
       "      <th>color</th>\n",
       "      <th>food</th>\n",
       "      <th>age</th>\n",
       "      <th>height</th>\n",
       "      <th>score</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Jane</th>\n",
       "      <td>NY</td>\n",
       "      <td>blue</td>\n",
       "      <td>Steak</td>\n",
       "      <td>30</td>\n",
       "      <td>165</td>\n",
       "      <td>4.6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Niko</th>\n",
       "      <td>TX</td>\n",
       "      <td>green</td>\n",
       "      <td>Lamb</td>\n",
       "      <td>2</td>\n",
       "      <td>70</td>\n",
       "      <td>8.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Aaron</th>\n",
       "      <td>FL</td>\n",
       "      <td>red</td>\n",
       "      <td>Mango</td>\n",
       "      <td>12</td>\n",
       "      <td>120</td>\n",
       "      <td>9.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Penelope</th>\n",
       "      <td>AL</td>\n",
       "      <td>white</td>\n",
       "      <td>Apple</td>\n",
       "      <td>4</td>\n",
       "      <td>80</td>\n",
       "      <td>3.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Dean</th>\n",
       "      <td>AK</td>\n",
       "      <td>gray</td>\n",
       "      <td>Cheese</td>\n",
       "      <td>32</td>\n",
       "      <td>180</td>\n",
       "      <td>1.8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Christina</th>\n",
       "      <td>TX</td>\n",
       "      <td>black</td>\n",
       "      <td>Melon</td>\n",
       "      <td>33</td>\n",
       "      <td>172</td>\n",
       "      <td>9.5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Cornelia</th>\n",
       "      <td>TX</td>\n",
       "      <td>red</td>\n",
       "      <td>Beans</td>\n",
       "      <td>69</td>\n",
       "      <td>150</td>\n",
       "      <td>2.2</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          state  color    food  age  height  score\n",
       "Jane         NY   blue   Steak   30     165    4.6\n",
       "Niko         TX  green    Lamb    2      70    8.3\n",
       "Aaron        FL    red   Mango   12     120    9.0\n",
       "Penelope     AL  white   Apple    4      80    3.3\n",
       "Dean         AK   gray  Cheese   32     180    1.8\n",
       "Christina    TX  black   Melon   33     172    9.5\n",
       "Cornelia     TX    red   Beans   69     150    2.2"
      ]
     },
     "execution_count": 83,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df = pd.read_csv('../../data/sample_data.csv', index_col=0)\n",
    "df"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Exercise 3\n",
    "<span  style=\"color:green; font-size:16px\">Determine whether the following line of code is chained indexing. If it is, rewrite it so that it is not.</span>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 84,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>state</th>\n",
       "      <th>food</th>\n",
       "      <th>color</th>\n",
       "      <th>score</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Jane</th>\n",
       "      <td>NY</td>\n",
       "      <td>Steak</td>\n",
       "      <td>blue</td>\n",
       "      <td>4.6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Niko</th>\n",
       "      <td>TX</td>\n",
       "      <td>Lamb</td>\n",
       "      <td>green</td>\n",
       "      <td>8.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Aaron</th>\n",
       "      <td>FL</td>\n",
       "      <td>Mango</td>\n",
       "      <td>red</td>\n",
       "      <td>9.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Penelope</th>\n",
       "      <td>AL</td>\n",
       "      <td>Apple</td>\n",
       "      <td>white</td>\n",
       "      <td>3.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Dean</th>\n",
       "      <td>AK</td>\n",
       "      <td>Cheese</td>\n",
       "      <td>gray</td>\n",
       "      <td>1.8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Christina</th>\n",
       "      <td>TX</td>\n",
       "      <td>Melon</td>\n",
       "      <td>black</td>\n",
       "      <td>9.5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Cornelia</th>\n",
       "      <td>TX</td>\n",
       "      <td>Beans</td>\n",
       "      <td>red</td>\n",
       "      <td>2.2</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          state    food  color  score\n",
       "Jane         NY   Steak   blue    4.6\n",
       "Niko         TX    Lamb  green    8.3\n",
       "Aaron        FL   Mango    red    9.0\n",
       "Penelope     AL   Apple  white    3.3\n",
       "Dean         AK  Cheese   gray    1.8\n",
       "Christina    TX   Melon  black    9.5\n",
       "Cornelia     TX   Beans    red    2.2"
      ]
     },
     "execution_count": 84,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df[['state', 'food', 'color', 'score']]\n",
    "# Your code here"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Exercise 4\n",
    "<span  style=\"color:green; font-size:16px\">Determine whether the following line of code is chained indexing. If it is, rewrite it so that it is not.</span>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 85,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>state</th>\n",
       "      <th>food</th>\n",
       "      <th>color</th>\n",
       "      <th>score</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Niko</th>\n",
       "      <td>TX</td>\n",
       "      <td>Lamb</td>\n",
       "      <td>green</td>\n",
       "      <td>8.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Penelope</th>\n",
       "      <td>AL</td>\n",
       "      <td>Apple</td>\n",
       "      <td>white</td>\n",
       "      <td>3.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Dean</th>\n",
       "      <td>AK</td>\n",
       "      <td>Cheese</td>\n",
       "      <td>gray</td>\n",
       "      <td>1.8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Cornelia</th>\n",
       "      <td>TX</td>\n",
       "      <td>Beans</td>\n",
       "      <td>red</td>\n",
       "      <td>2.2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Jane</th>\n",
       "      <td>NY</td>\n",
       "      <td>Steak</td>\n",
       "      <td>blue</td>\n",
       "      <td>4.6</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "         state    food  color  score\n",
       "Niko        TX    Lamb  green    8.3\n",
       "Penelope    AL   Apple  white    3.3\n",
       "Dean        AK  Cheese   gray    1.8\n",
       "Cornelia    TX   Beans    red    2.2\n",
       "Jane        NY   Steak   blue    4.6"
      ]
     },
     "execution_count": 85,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.loc[['Niko', 'Penelope', 'Dean', 'Cornelia', 'Jane'], ['state', 'food', 'color', 'score']]\n",
    "# Your code here"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Exercise 5\n",
    "<span  style=\"color:green; font-size:16px\">Determine whether the following line of code is chained indexing. If it is, rewrite it so that it is not.</span>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 86,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>state</th>\n",
       "      <th>food</th>\n",
       "      <th>color</th>\n",
       "      <th>score</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Niko</th>\n",
       "      <td>TX</td>\n",
       "      <td>Lamb</td>\n",
       "      <td>green</td>\n",
       "      <td>8.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Penelope</th>\n",
       "      <td>AL</td>\n",
       "      <td>Apple</td>\n",
       "      <td>white</td>\n",
       "      <td>3.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Dean</th>\n",
       "      <td>AK</td>\n",
       "      <td>Cheese</td>\n",
       "      <td>gray</td>\n",
       "      <td>1.8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Cornelia</th>\n",
       "      <td>TX</td>\n",
       "      <td>Beans</td>\n",
       "      <td>red</td>\n",
       "      <td>2.2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Jane</th>\n",
       "      <td>NY</td>\n",
       "      <td>Steak</td>\n",
       "      <td>blue</td>\n",
       "      <td>4.6</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "         state    food  color  score\n",
       "Niko        TX    Lamb  green    8.3\n",
       "Penelope    AL   Apple  white    3.3\n",
       "Dean        AK  Cheese   gray    1.8\n",
       "Cornelia    TX   Beans    red    2.2\n",
       "Jane        NY   Steak   blue    4.6"
      ]
     },
     "execution_count": 86,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.loc[['Niko', 'Penelope', 'Dean', 'Cornelia', 'Jane']][['state', 'food', 'color', 'score']]\n",
    "# Your code here"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Exercise 6\n",
    "<span  style=\"color:green; font-size:16px\">Determine whether the following line of code is chained indexing. If it is, rewrite it so that it is not.</span>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 87,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>color</th>\n",
       "      <th>food</th>\n",
       "      <th>age</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Aaron</th>\n",
       "      <td>red</td>\n",
       "      <td>Mango</td>\n",
       "      <td>12</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Penelope</th>\n",
       "      <td>white</td>\n",
       "      <td>Apple</td>\n",
       "      <td>4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Dean</th>\n",
       "      <td>gray</td>\n",
       "      <td>Cheese</td>\n",
       "      <td>32</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          color    food  age\n",
       "Aaron       red   Mango   12\n",
       "Penelope  white   Apple    4\n",
       "Dean       gray  Cheese   32"
      ]
     },
     "execution_count": 87,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.iloc[:5].iloc[2:, 1:4]\n",
    "# Your code here"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Exercise 7\n",
    "<span  style=\"color:green; font-size:16px\">Determine whether the following line of code is chained indexing. If it is, rewrite it so that it is not.</span>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 88,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Niko          2\n",
       "Christina    33\n",
       "Cornelia     69\n",
       "Name: age, dtype: int64"
      ]
     },
     "execution_count": 88,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df[df['state'] == 'TX']['age']\n",
    "# Your code here"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Exercise 8\n",
    "<span  style=\"color:green; font-size:16px\">Determine whether the following line of code is chained indexing. If it is, rewrite it so that it is not.</span>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 89,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>age</th>\n",
       "      <th>state</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Jane</th>\n",
       "      <td>30</td>\n",
       "      <td>NY</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Aaron</th>\n",
       "      <td>12</td>\n",
       "      <td>FL</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "       age state\n",
       "Jane    30    NY\n",
       "Aaron   12    FL"
      ]
     },
     "execution_count": 89,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df[['state', 'food', 'age', 'height']][['state', 'age']].loc[['Jane', 'Aaron'], ['age', 'state']]\n",
    "# Your code here"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Exercise 9\n",
    "<span  style=\"color:green; font-size:16px\">Change the score for all people who are over 30 years of age to 99 without doing chained indexing.</span>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 90,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Your code here"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Exercise 10\n",
    "<span  style=\"color:green; font-size:16px\">Select the **`color`** and **`food`** columns into their own variable.  Then write one more line of code that will trigger the **`SettingWithCopy`** warning.</span>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 91,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Your code here"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Exercise 11\n",
    "<span  style=\"color:green; font-size:16px\">Select the **`color`** and **`food`** columns into their own variable like you did in exercise 10.  Do the same operation as you did without getting the warning.</span>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 92,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Your code here"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Exercise 12\n",
    "<span  style=\"color:green; font-size:16px\">Change the values of **`color`**, **`age`**, and **`score`** for **`Niko`** to anything you like</span>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 93,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Your code here"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
